Morpheme- and POS-based IBM1 scores and language model scores for translation quality estimation
Author
Abstract
We present the method we used for the WMT 2012 quality estimation shared task, which relies on IBM1 and language model scores calculated on morphemes and POS tags. The IBM1 scores calculated on morphemes and POS 4-grams of the source sentence and the obtained translation output are shown to be competitive with the classic evaluation metrics for ranking translation systems. Because these scores do not require any reference translations, they can be used as features for the quality estimation task, where they capture a connection between the source language and the obtained target language. In addition, target language model scores of morphemes and POS tags are investigated as estimates of the quality of the obtained target language.
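As a rough illustration of the kind of reference-free score described above, the sketch below computes a length-normalised IBM Model 1 log-probability of a translation output given its source sentence. It uses the standard textbook IBM1 formulation, not necessarily the exact variant used in the paper. The function name `ibm1_log_score`, the `NULL` token, the probability floor, the length normalisation, and the toy lexicon are all assumptions made for this example. In practice the lexicon would be trained on a parallel corpus whose tokens are morphemes or POS n-grams.

```python
import math

NULL = "<null>"  # hypothetical token for the empty source position


def ibm1_log_score(source, target, lexicon, floor=1e-10):
    """Length-normalised IBM Model 1 log-probability of `target` given `source`.

    `lexicon` maps (target_token, source_token) pairs to p(target | source).
    Tokens can be any units: words, morphemes, or POS n-grams.
    The floor and the division by the target length are assumptions.
    """
    src = [NULL] + list(source)
    if not target:
        return 0.0
    log_prob = 0.0
    for t in target:
        # Sum lexicon probabilities over all source positions, including NULL.
        total = sum(lexicon.get((t, s), 0.0) for s in src)
        # Uniform alignment prior 1 / (J + 1); floor avoids log(0).
        log_prob += math.log(max(total, floor) / len(src))
    return log_prob / len(target)


# Toy POS-level lexicon (invented values, for illustration only).
toy_lexicon = {
    ("DET", "DET"): 0.9,
    ("NOUN", "NOUN"): 0.8,
    ("NOUN", "DET"): 0.1,
}

good = ibm1_log_score(["DET", "NOUN"], ["DET", "NOUN"], toy_lexicon)
bad = ibm1_log_score(["DET", "NOUN"], ["VERB", "VERB"], toy_lexicon)
```

With this toy lexicon, the output whose POS tags match the source receives a higher score than the mismatched one, which is the property that makes such a score usable as a quality estimation feature without reference translations.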
Similar resources
Morpheme- and POS-based IBM1 and language model scores for translation quality estimation
We present a method we used for the quality estimation shared task of WMT 2012 involving IBM1 and language model scores calculated on morphemes and POS tags. The IBM1 scores calculated on morphemes and POS 4-grams of the source sentence and the obtained translation output are shown to be competitive with the classic evaluation metrics for ranking translation systems. Since these scores do not req...
Evaluation without references: IBM1 scores as evaluation metrics
Current metrics for evaluating machine translation quality have the huge drawback that they require human-quality reference translations. We propose a truly automatic evaluation metric based on IBM1 lexicon probabilities which does not need any reference translations. Several variants of IBM1 scores are systematically explored in order to find the most promising directions. Correlations between...
Intelligent Hybrid Man-Machine Translation Quality Estimation
Inferring evaluation scores based on human judgments is invaluable compared to using current evaluation metrics, which are not suitable for real-time applications, e.g. post-editing. However, these judgments are much more expensive to collect, especially from expert translators, compared to evaluation based on indicators contrasting source and translation texts. This work introduces a novel approa...
Enriching Phrase-Based Statistical Machine Translation with POS Information
This work presents an extension to phrase-based statistical machine translation models which incorporates linguistic knowledge, namely part-of-speech information. Scores are added to the standard phrase table which represent how the phrases correspond to their translations on the part-of-speech level. We suggest two different kinds of scores. They are learned from a POS-tagged version of the para...
Simultaneous Word-Morpheme Alignment for Statistical Machine Translation
Current word alignment models for statistical machine translation do not address morphology beyond merely splitting words. We present a two-level alignment model that distinguishes between words and morphemes, in which we embed an IBM Model 1 inside an HMM-based word alignment model. The model jointly induces word and morpheme alignments using an EM algorithm. We evaluated our model on Turkish-...
Publication date: 2012